Secure storage and access to sensitive data

ABSTRACT

A method and system for securely storing and accessing sensitive user data (e.g., personally identifying information or PII) is described. In an aspect, PII is divided into a plurality of separately stored data stores based on what type or field of PII is collected. Each piece of PII data or PII datum is associated with a unique code so as to form data pairs comprising the PII datum and the unique code associated with that PII datum. A tumbler data structure allows secure association of the unique codes for the PII data for each user. Once the tumbler data structure is unlocked, a provider can search and access the PII data of its users.

TECHNICAL FIELD

This invention relates to the storage of and access to sensitive personal information, which may be used to protect the confidentiality of sensitive data and to control access to the data.

BACKGROUND

There is a known need to collect and store data and information, for example personal, financial, demographic and similar information, to enable transactions requiring such information. Electronic commercial transactions and a growing number of medical, educational and government services rely on the provision of such sensitive data. Examples abound, notably in online commerce, where products and services are purchased over the Internet without the buyer and seller meeting face to face to authenticate or assure the transaction. We generally refer to all types of retailers, institutions, corporations or entities providing services, products or account access herein as “providers.” The consumers, clients, customers, buyers, subscribers or beneficiaries receiving the products or services of the providers are referred to herein as “users.” In general, both the providers and users in the present context can be individuals, corporations or other entities.

Conventionally, a provider offering products or services requires information concerning the user of its products or services, including for example a user account name or number and a user password. Providers typically collect personal and financial data, or data generally identifying a user that alone or in combination the user would not choose to be known by unauthorized parties, to enable secure transactions and access to their services. We refer to such information and data as personally identifying information (PII), which is commonly of a private nature, including a user's name, address, telephone number and email address. This information can include further sensitive data about the user such as the user's credit card numbers, date of birth, social security number and bank account information. Still further, this information sometimes includes very personal and non-public authenticating information concerning the user as an individual to confirm secure access by this person to an online transaction or account. This personal authenticating information can include a user's mother's maiden name, the name of the person's first pet, the color of the user's child's eyes, the make of their first car, city of birth, or other information presumed to prevent access by persons other than an authorized user who has provided such authenticating personal information to the provider.

User information or PII is collected by the providers and stored to facilitate later repeat visits by a user to a provider's site or store and to enable payment for products and services delivered by the provider to the user. The terms and conditions for taking a user's personal and financial data as described above typically include some promise by the provider to securely store, handle and maintain the user's data according to certain standards of care.

Failure to properly secure users' PII can have costly and embarrassing consequences for the provider who loses or compromises the integrity of its users' PII. For example, an online retailer who causes the loss of sensitive consumer personal or financial data through negligence or by failing to adequately protect it from criminal activity can face civil and criminal action and face restitution claims by the injured users who entrusted the provider with their PII.

It is therefore clear that best practices and technology for securing, storing and accessing sensitive user information are necessary to enable the kinds of transactions and relationships becoming the norm in our networked society. Providers who cannot reliably collect and store and deliver access to sensitive user data cannot compete and may not lawfully participate in sensitive online financial, retail, medical, governmental and other operations offered to consumers and client organizations.

Present systems and methods for keeping sensitive user data generally involve storing the user data in a secure database or electronic data structure. The secure database is assumed to prevent access to the sensitive data by anyone other than the provider's authorized administrator of the database. Secure databases include databases of user information that are encrypted using passwords only available to said authorized administrators. However, even encrypted databases can be compromised by dedicated and/or skilled attacks by criminals, or by negligence on the part of the database administrator.

Once compromised, a database can expose a vast amount of sensitive information of the provider and its users. As an example, a hacked customer database can cause the unlawful release of millions of customer records into the hands of unauthorized persons who can then use the information to fraudulently make purchases in the name of the original customers, sell the sensitive data without the permission of its owners, and so on. An analogy to conventional secure customer databases is to consider a vault that contains a large amount of valuables therein belonging to many users who entrusted the provider of the vault with their personal and financial data. An illegal release of the key or combination to the vault will thus result in the complete loss of the contents of the vault, injuring all those whose property is kept in the vault.

Furthermore, even when not compromised, current methods for storing and accessing large databases of sensitive data are inefficient and costly to their providers because they require an undue amount of computer resources to maintain and operate. In one aspect, the conventional secure database needs to be unencrypted in its entirety to permit access to any portion thereof or to permit any search of the database's contents (which cannot be searched when the database is encrypted). Accordingly, conventional systems need to unencrypt whole large data sets or records, which is computationally difficult, in order to conduct transactions on the data in the database or to search the database.

FIG. 1 illustrates a simplified arrangement for storing and accessing sensitive data in a secure database according to the prior art. Database 100 is presumably secured by an encryption scheme requiring a key 120 to unlock or unencrypt the database 100. The key 120 is only held by the manager of the database or the manager's designees and is presumed secure from unauthorized access.

Sensitive data or PII 110 is kept as a data record, table or similar data store in database 100. Examples of PII stored in the database 100 include a user's first and last names 112, 113; the user's date of birth (DOB) 111; the user's home address 116; telephone number 115; Social Security Number (SSN) 114; email address 117; credit card information 118; etc. It is clear that a user does not wish this sensitive information about him or her to be made public or fall into the hands of unauthorized users. Many users 120 (e.g., all of the customers of a bank or retail store) have their data in database 100. Therefore, the encryption key 120 is the conventional system's primary way to avoid unwanted loss of sensitive data 110. The full table of PII 110 must be accessed and unencrypted each time any information from the database is to be retrieved or searched.

FIG. 2 illustrates a conventional database 20. The database 20 is typically a monolithic computer data structure, file, or similar record stored in a secure memory storage device and contains personally identifying information (PII) for a plurality N of users or members of a group such as customers of a retail store. Each user U1, U2, . . . , UN has a corresponding user record or user file 200, 210, . . . , 220 associated therewith. Each record or file 200, 210, 220 of the users U1, U2, UN contains a plurality of PII. For example, for each respective user the user's first name, last name, street address, town, state, zip code, phone number, date of birth, social security number, membership ID number, credit card numbers, bank account numbers, or any other PII including custom or proprietary PII characterizing the user can be included. We indicate a first PII of a first user as U1.PII1; a second PII of the first user as U1.PII2; a third PII of the Nth user as UN.PII3; and so on. In this configuration, the database 20 is stored and accessed as a whole, requiring the entire database 20 to be made available to anyone reading, writing or searching its contents. The entire database 20 is encrypted and unencrypted to secure and to access the database 20. A compromise of the encryption key for encrypting database 20 would therefore result in the compromise of all of its contents. An unauthorized person who unencrypts and accesses database 20 could see all of the stored PII for the users U1, U2, . . . , UN, thereby injuring the privacy and security of each of the user members of the database 20. Furthermore, if the number of PIIs 202 kept in the database 20 or the number of users N becomes large, the computational effort and cost required to constantly lock (encrypt) and unlock (unencrypt) database 20 becomes unreasonably large. Distributing database 20 and its encryption key among administrators of the database increases the likelihood of loss of the contents of database 20.

Improved systems and methods for collecting, storing and accessing sensitive data are required. Not only are the confidentiality and security of the information current issues, but the speed and cost of operating such systems are as well. The following disclosure provides such systems and methods, including preferred embodiments detailing exemplary operations thereof, which can be used in a large number of applications. The benefits of the following systems and methods are applicable to industry, commerce, health care, government, education and other fields.

SUMMARY

An embodiment of the present invention is directed to a method for storing and accessing personally identifying information (PII) including a plurality of type fields of PII, comprising: collecting a plurality of PII data relating to a plurality of fields of PII for each user of a plurality of users; for each collected PII datum and for each of said users and for each type field of PII collected, associating a unique code with said PII datum so as to create unique data pairs, each data pair comprising said PII datum and its associated unique code; for each of said type fields of PII collected, storing said data pairs relating to a PII datum in said type field of PII in respective separate data stores so that the data pairs for the plurality of type fields of PII are not all stored in a same data store but are separated by the type field of PII to which they relate; and storing in a separate and secure data store a data structure that associates the unique codes for each PII datum that are associated with a given user once said separate and secure data store is unlocked.

Another embodiment is directed to a system including an application process coupled to a key store holding a security key (e.g., encryption key), a secure tumbler data structure requiring said security key to unlock, a vault and a bank data store. An example includes a system for storing and accessing sensitive personally identifying information (PII), comprising a first data store comprising a plurality of data pairs of a first field type, each of said data pairs of the first field type including a PII of said first field type and a corresponding unique identifying code for each of said PII of the first field type; a second data store comprising a plurality of data pairs of a second field type, different from said first field type, each of said data pairs of the second field type including a PII of said second field type and a corresponding unique identifying code for each of said PII of the second field type; and a third data store comprising associating said unique identifying codes of the first field type data pairs with said unique identifying codes of the second field type data pairs so as to permit uniquely associating the corresponding PII of said first types with the corresponding PII of said second types.

BRIEF DESCRIPTION OF THE DRAWINGS

For a fuller understanding of the nature and advantages of the present invention, reference is made to the following detailed description of preferred embodiments and in connection with the accompanying drawings, in which:

FIG. 1 illustrates a conventional encrypted database of information;

FIG. 2 illustrates a conventional database of several users each having a record containing several pieces of personally identifying information (PII);

FIG. 3 illustrates corresponding sets of personally identifying information (PII) and GUIDs for a plurality of users;

FIGS. 4 and 5 illustrate exemplary data pairs of PII and GUIDs;

FIG. 6 illustrates an architecture according to embodiments of the present system and method;

FIG. 7 illustrates various PII of different field types for a plurality of users;

FIG. 8 illustrates contents of a data store containing First name information and uniquely identifying GUID codes;

FIG. 9 illustrates contents of a data store containing Postal Code information and uniquely identifying GUID codes;

FIG. 10 illustrates contents of a data store containing Phone numbers and uniquely identifying GUID codes; and

FIG. 11 illustrates contents of a “Tumbler” data store associating GUID codes of various field types with one another.

DETAILED DESCRIPTION

The present disclosure is directed to systems and methods for securely and efficiently storing and using personally identifying information (PII), which includes several fields of PII for a given user of a service or offering provided by a provider. Compared to brute force prior art approaches requiring encrypting and unencrypting large databases of user PII (e.g., databases of customer personal and financial data), the present systems and methods are more secure and are more computationally efficient.

In an aspect, the present invention pairs each piece of PII data with a corresponding globally unique identification (GUID) code associated with that piece of PII data. The GUID may be a random number of a certain form or length associated with a given piece of PII data. This piece of PII data and its associated GUID code form a data pair. In another aspect, the present invention divides the storage of the (PII:GUID) data pairs into separate data stores, each data store containing information relating to a separate field of PII. In yet another aspect, the invention stores a secure table or tumbler means that permits association of the various GUID codes of a user so that the system can search for PII of a user based on a criterion.
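By way of non-limiting illustration only, the three aspects above may be sketched in Python as follows. The names `store_user`, `field_stores` and `tumbler`, the use of `uuid.uuid4` to generate the GUID codes, and the in-memory dictionaries standing in for separate data stores are all assumptions made for this sketch and are not taken from the disclosure; the tumbler is shown unsecured for brevity.

```python
import uuid
from collections import defaultdict

def store_user(record, field_stores, tumbler):
    """Split one user's record into per-field stores of GUID -> PII pairs.

    field_stores: one dict per PII field type (stand-in for separate data stores).
    tumbler: list of rows; each row maps field type -> GUID for a single user.
    """
    row = {}
    for field, value in record.items():
        guid = uuid.uuid4().hex            # random code unique to this PII datum
        field_stores[field][guid] = value  # the PII:GUID data pair, stored by field
        row[field] = guid
    tumbler.append(row)                    # only this row links the GUIDs together
    return row

field_stores = defaultdict(dict)  # hypothetical in-memory stand-in
tumbler = []                      # would be a separately secured store

store_user({"first_name": "David", "postal_code": "02156"}, field_stores, tumbler)
store_user({"first_name": "Dave", "postal_code": "98101"}, field_stores, tumbler)
```

In this sketch no per-field store records which user a value belongs to; that association exists only in the rows of `tumbler`.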

FIG. 3 illustrates an architecture 30 for storing and accessing PII according to the present invention. A plurality of databases or data structures (generally data stores) 300, 310, 320 and 330 store a corresponding plurality of PII fields of different types for the same group of users. The data stores 300, 310, 320 and 330 may be represented as each being locked or secured by an encryption key 302, 312, 322 and 332 respectively, but the securement of each of the separate data stores is optional and can be flexibly customized per data store to suit a given need. For example, data store 300 can store a name field PII 304 of a set of users; data store 310 can store a telephone number field PII 314 for the set of users; data store 320 can store a credit card number field PII 324 for the set of users; and data store 330 can store a date of birth field PII 334 for the set of users. More data stores can be included in the system 30. However, the data stores, while being able to be kept physically or logically separated, are accessible by the system as needed. We now examine the way in which data is kept in the data stores 300, 310, 320, 330 of the present example.

FIG. 4 illustrates one of the PII field groups 304 of one of the above data stores 300, which can for example be a user “name” field PII. The PII fields may be broken down into sub-fields as desired, and the sub-fields may be separated into discrete data stores or combined into a same data store, depending on the application at hand and the level of security desired. So, for example, the user or customer name may be kept in a same store including both first and last names, or it may be kept separately in two stores, one containing first names and the other containing last names. But the name(s) and other PII fields or sub-fields are separately stored in different databases or data structures accessible to the provider or owner of the database system.

A security advantage of this architecture is that negligent or malicious cracking (unlocking) of one data store would only expose one field of PII (e.g., names but not other data; or phone numbers but not other data; or social security numbers but not other data; and so on). This is a security advance because, unlike traditional secure databases as in FIG. 1, the present system does not provide access to all PII type fields for any users even if one of the data stores of FIG. 3 is compromised. In fact, if some data in one or more stores of system 30 is not very sensitive it may not need to be encrypted and could even be stored as clear text or plain data. For example, type fields such as zip codes, employer addresses, length of time at current residence, or dates of birth, taken without other PII information, are of little value and may not need to be stored in locked data stores. Storing non-sensitive data type fields in this way could speed the access time and search time, as encrypting and decrypting data requires costly computing resources better used for other purposes. Specifically, by leaving certain types of PII unencrypted in their data stores, the present system and method would facilitate easier searching with lower computational requirements than searching a data store which is encrypted. More specifically, in an embodiment where one or more PII are not encrypted in their data stores, this method and system could easily perform generalized (e.g., wildcard) searching such as a search for “Da*” which would capture “Dave”, “David” and other matching results. This type of searching is not possible, or is much more complicated and costly, when searching encrypted data store contents.
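As a minimal sketch of the wildcard searching mentioned above, assuming a first-name data store that is kept unencrypted, a “Da*” search could be performed directly on the stored values. The store contents, the GUID strings and the helper name `wildcard_search` are hypothetical; the matching uses the Python standard library function `fnmatch.fnmatchcase`.

```python
from fnmatch import fnmatchcase

# Hypothetical unencrypted first-name store of GUID -> PII data pairs.
first_names = {
    "guid-01": "Dave",
    "guid-02": "Maria",
    "guid-03": "David",
    "guid-04": "Dana",
}

def wildcard_search(store, pattern):
    """Return the GUID -> PII pairs whose value matches a shell-style pattern."""
    return {guid: value for guid, value in store.items()
            if fnmatchcase(value, pattern)}  # case-sensitive match

matches = wildcard_search(first_names, "Da*")
# matches holds the pairs for "Dave", "David" and "Dana", but not "Maria"
```

The search returns GUID codes along with the names, and by themselves those codes do not reveal which other PII belongs to the same users.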

The PII field group 304 of FIG. 4 (for example, the users' first name field represented at PII_01) includes a plurality of data pairs whereby a PII information type field for a user is paired with a distinct identifying number or code (GUID) to make a distinct PII-GUID data pair. For example, a data pair includes the PII field 400 and the GUID 402. In this example, the pairs include PII_01:GUID_01; PII_01:GUID_02; . . . PII_01:GUID_0N. The first of these pairs may represent the first name (PII_01) and its associated GUID_01 402 for a first user. Another data pair is PII_01 410 and GUID_02 412, representing a second user's first name (PII_01) and its associated GUID_02 412.

Continuing this example, the PII field group 314 (for example, the users' dates of birth PII_02) has for each user a data pair including the date of birth data itself PII_02 for a user and an associated GUID for that data for that user. The PII field group 314 in FIG. 5 includes a data pair 500:502 representing the date of birth PII_02:GUID_11 of the first user; another data pair PII_02 510:GUID_12 representing a second user's birth date PII_02 and the second user's birth date GUID_12 512, and so on.

It should be clear now that for as many fields or sub-fields of PII data for as many users or subscribers or customers desired, any individual data entry for any user is paired with a GUID code, preferably unique to that user and PII field entry.

Since the present invention keeps the fields of PII data separated into separate data structures or databases, it is now time to describe how the present system associates the various PII fields for a given user with one another. That is, we now discuss how the system associates a given user's name, address, phone number and bank account (or other PII fields) with one another if the system's users' names are all in one store, but all of the addresses are in another store, and the phone numbers in yet another store and the bank accounts in still another store. Recall that, as shown in FIG. 3, each field of PII is stored separately in a data store. Providers (e.g., retailers, banks, healthcare providers, educational institutions) need to look up for a given user what the user's PIIs are generally and not just one field of the given user's PII, whereas merely unlocking one data store would only unlock one field of PII. Accordingly, the provider will need a method for associating several or all fields of the given user's PII with one another. For an abundance of clarity we continue our discussion in view of the examples of FIGS. 3, 4 and 5. Assume that the user names 304 are encoded in data store 300 and their birth dates in data store 310 and so on. Unencrypting data store 310 containing the set of user birth dates 314 would not tell us which birth dates 314 belonged to which user names 304, and vice versa. We present a way to associate these PII fields 304, 314 (and others 324, 334, etc.) with one another so a provider can see for a given user what the user's complete PII is. An aspect of this invention uses the GUIDs associated with each PII field of each user as a way of coupling the separately stored PII field data with one another.

In an embodiment, the present system and method include a secure data store containing the GUID codes for the various users' PII data fields and enable the system to determine which GUID codes from each of the data stores of FIG. 3 belong to a same user, in the way a mechanical lock “tumbler” operates to unlock the mechanical lock when the components of the lock are tumbled by an appropriate key to fall into place, revealing the whole PII picture for a given user. Once unlocked, the present tumbler system and method permit an authorized provider with access codes to the tumbler database of GUID codes to associate the correct GUID codes with one another so that the provider can see the PII data for a given user in their entirety or to the extent needed for a given application. This means that, for example, a retail provider can look up and ship to a given user his or her merchandise, using the user's correct name, address, credit card information and other data that can all be associated by way of these fields' GUIDs and that are respectively associated with the PII fields for the user as discussed above. Since the PII fields for the user are uniquely connected to their respective GUID codes, unlocking a GUID code database (tumbler) store will serve to connect the corresponding PII field data for the user with one another.
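The association performed by an unlocked tumbler can be sketched, under stated assumptions, as a lookup that follows one user's GUID codes into each separate data store. The stores, GUID strings, second phone number and the function name `assemble` are hypothetical, and the tumbler rows are shown already unlocked; how the tumbler store is secured is outside this sketch.

```python
# Hypothetical per-field stores of GUID -> PII data pairs.
stores = {
    "name":  {"g-n1": "David", "g-n2": "Maria"},
    "phone": {"g-p1": "1.617.847.8489", "g-p2": "1.206.555.0100"},
}

# Hypothetical unlocked tumbler: each row lists the GUID codes of one user.
tumbler_rows = [
    {"name": "g-n1", "phone": "g-p1"},
    {"name": "g-n2", "phone": "g-p2"},
]

def assemble(row, stores):
    """Follow each GUID in a tumbler row into its field store to rebuild one user's PII."""
    return {field: stores[field][guid] for field, guid in row.items()}

first_user = assemble(tumbler_rows[0], stores)
# first_user == {"name": "David", "phone": "1.617.847.8489"}
```

Without the tumbler rows, the two stores in this sketch offer no indication of which phone number belongs to which name.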

It should be appreciated that the present system and method offer computational efficiencies in addition to security advantages as mentioned above. The individual data stores of FIG. 3 are far easier to unlock separately (decrypt if encrypted) than to unlock an encrypted monolithic database like the one of FIG. 1. This is especially easy under the present method if one or more non-sensitive fields of data are stored without encrypting them in the first place. As mentioned, street address numbers alone would be of such little value to an unauthorized user as to be potentially safe to store in plain form without encryption. Birth dates are also not very interesting to malicious third parties, as any date of the year could be posted publicly and is sure to be someone's birthday, which does not give away any sensitive information about anybody. Even first names might be stored without special care for their particular obscurity because a database of first names is unlikely to compromise any or most people. It is when the separate PII fields are associated with one another (for example in knowing the name, address, and phone numbers of people) that the combined fields become of some value or risk to their owners. As mentioned before, there are computational and cost advantages to keeping at least some of the PII stored in unencrypted form so as to be able to easily search through such data, perform wildcard searching on it, etc.

FIG. 6 illustrates an architecture 60 according to embodiments of the present system and method. An application process 600 is configured and operated to exchange data with several components of the architecture 60. These components include a key store 610 containing encryption keys 612 or similar means for unlocking a secure (e.g., encrypted) data store. Tumbler 620 contains a persistent data store 622 in which are stored tkey:ekey pairs of data. Vault 630 contains a persistent data store 632 in which are stored data pairs ekey:VALUE and ekey:EVALUE. Also, vault 630 can include or be associated with a search index which indexes the ekey:VALUE data pairs. Bank 640 includes a persistent data store 642 holding data pairs pkey:VALUE as well as a search index which indexes data pair VALUE:pkey.

Therefore, the application process 600 can store and retrieve encryption or security keys 612 in key store 610. These keys are used to unlock the tumbler 620 and therefore reveal the associations between the GUID codes corresponding to the PII fields of a user so that the user's PII are generally unlocked upon demand by an authorized provider.

In an embodiment, the “ekey” can be an encrypted code, which is encrypted using the “tkey” as its encryption key. The PII data shown at FIGS. 8, 9, 10 may have different “ekey” assignments to the various PII associated with a given user so that if these data are compromised it is not obvious which user the data relate to. This means that, in an embodiment, the same user's Name, Phone, Email, Address, etc. each have a different encrypted “ekey”, all of which are derived from encrypting a same “tkey”. The “tkey” is what connects the various encryption results for that given user's PII and their respective encrypted ekeys. As will be discussed below, by encrypting the “ekeys” there is no easy way for unauthorized persons to determine which “ekeys” for which PII pieces belong to one another without the “Tumbler” key (“tkey”) to tell the system which “ekeys” belong with one another.

Therefore, the “ekeys” can be encrypted (either in a PII data store or in the Tumbler), which will avoid unwanted exposure of the connectedness of the associated PII by anyone not in possession of the instant Tumbler key.
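The relationship between one “tkey” and several per-field “ekeys” can be illustrated with the following sketch, which rests on a substitution that should be noted plainly: the disclosure describes the ekeys as encrypted using the tkey, whereas this sketch uses a keyed hash (HMAC-SHA256 from the Python standard library) as a stand-in, because the standard library provides no symmetric cipher. The function name `derive_ekey` and the field labels are hypothetical.

```python
import hashlib
import hmac
import secrets

def derive_ekey(tkey: bytes, field: str) -> str:
    """Derive a per-field code from a user's tkey.

    Assumption: HMAC-SHA256 stands in for the encryption step of the
    disclosure; it is a one-way keyed derivation, not a cipher.
    """
    return hmac.new(tkey, field.encode("utf-8"), hashlib.sha256).hexdigest()

tkey = secrets.token_bytes(32)  # hypothetical per-user Tumbler key
ekeys = {field: derive_ekey(tkey, field) for field in ("name", "phone", "email")}

# Each field receives a different ekey, yet all can be recomputed from the same tkey.
other_user = derive_ekey(secrets.token_bytes(32), "name")
```

In this sketch a party holding only the ekeys sees unrelated-looking codes, while a party holding the tkey can recompute each ekey and thereby recognize which ones belong together.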

Referring to the shaded boxes representing information passing between the components of system 60, these represent exemplary steps of communicating information according to an exemplary embodiment for using the system 60. In step 1 the Application Process 600 makes a call to the Keystore 610 to request encryption keys for a given customer. If the Application Process' request is authentic, Keystore 610 returns the requested encryption key 612 to the Application Process 600. The Application Process 600 takes the provided encryption key 612 and uses this to access secure Vault 630, which contains a data store 632 having PII information for a user or users who are the subject of a given query at step 3.

Once information matching a query is found, the PII:GUID data pairs and information from vault store 632 matching the query are returned from Vault 630 to Application Process 600 at step 4. The PII retrieved by the Application Process 600 from the Vault store 632 can include results of a query for a certain name, address, age, bank account information and other PII, which are returned to the Application Process 600 along with the corresponding GUID codes for each of these pieces of PII as described above with respect to FIGS. 4 and 5.

A query may for example call for all of the customers of a certain merchant whose first names are “Dave” and whose address includes “Seattle.” As noted before, this information alone, even if it were to fall into the wrong hands, is of little value as many people are called Dave and many addresses contain the word Seattle. Even where other PII such as social security numbers are exposed, they remain reasonably harmless because at this stage the information is just in raw format and is not sorted or arranged so that an unauthorized person could determine which Dave lived at which Seattle address and had which bank account, and so on. That organization to sort the pieces of discrete PII into same-person rows is done next by the Tumbler 620 component.

Now the system 60 sends all of the PII information and GUID code data pairs retrieved from Vault data store 632 to Tumbler 620 for sorting into meaningful data sets associated with particular users. Tumbler 620 has a Tumbler store 622 which holds in it the keys to unlocking, analogous to the tumbler of a mechanical lock, the match between various PII information using the PII:GUID data pairs of the information. So, using our mechanical analogy only to the limited extent of this statement, the Tumbler 620 can identify the GUID codes of the PIIs that belong to the same set of PII associated with a given user. The Tumbler 620 acts as the unscrambling element that can sort the scrambled PII information into meaningful sets so that the operator can then see, in response to its query, what the proper name, address, phone number, bank account number, age and other PII of a given user are. This sorted set of PII is returned at step 6 to the Application Process 600 and the operator of the process can intelligibly see the sorted PII information for users named “Dave” living at “Seattle” in the previous example.
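The query flow just described, in which per-field matches from the Vault are sorted into same-person rows by the Tumbler, may be sketched as follows. This is an illustration under stated assumptions and not the implementation of FIG. 6: the vault and tumbler are plain in-memory structures, key retrieval and unlocking are omitted, and the addresses, GUID strings and the function name `query` are hypothetical.

```python
# Hypothetical vault contents: one store of GUID -> PII pairs per field type.
vault = {
    "first_name": {"n1": "Dave", "n2": "Dave", "n3": "Maria"},
    "address": {
        "a1": "12 Pine St, Seattle",
        "a2": "9 Elm Rd, Boston",
        "a3": "4 Oak Ave, Seattle",
    },
}

# Hypothetical unlocked tumbler rows, one per user.
tumbler = [
    {"first_name": "n1", "address": "a1"},
    {"first_name": "n2", "address": "a2"},
    {"first_name": "n3", "address": "a3"},
]

def query(vault, tumbler, criteria):
    """Return full records of users whose fields contain every search term."""
    # Vault step: per field, collect the GUIDs of values containing the term.
    hits = {field: {guid for guid, value in vault[field].items() if term in value}
            for field, term in criteria.items()}
    # Tumbler step: keep only rows whose GUIDs matched for every criterion.
    rows = [row for row in tumbler
            if all(row[field] in hits[field] for field in criteria)]
    # Rebuild each matching user's PII from the separate stores.
    return [{field: vault[field][guid] for field, guid in row.items()}
            for row in rows]

results = query(vault, tumbler, {"first_name": "Dave", "address": "Seattle"})
# results == [{"first_name": "Dave", "address": "12 Pine St, Seattle"}]
```

Before the Tumbler step, the per-field hits contain two users named Dave and two Seattle addresses with nothing to indicate which belong together; only the tumbler rows resolve that to the single matching user.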

The Bank 640 contains a Bank data store 642, which holds other information relating to a customer, user, transaction or any other data that can be associated with the query, but is not necessarily of a highly sensitive nature. For example, the Bank store 642 could hold offers, discount codes or other electronic files, images or articles of commerce usable by Application Process 600 in the course of its operation. Steps A and B represent requests and replies to and from Bank 640. The Bank 640 is not a necessary component of all embodiments of the present invention. Likewise, one or more of the above components can be implemented by those skilled in the art differently from that described in the present illustrative examples. The arrangement and architecture of the components of FIG. 6 can be chosen as most suitable for a given operator of the system 60.

In an aspect, the Keystore 610 is a single-tenant multi-instance component, the Tumbler 620 is a single-tenant multi-instance component, the Vault 630 is a multi-tenant single-instance component, and the Bank 640 is a multi-tenant single-instance component. By single-tenant it is meant that a single customer's data is kept in the data store. Multi-tenant means multiple customers' data can be kept in the data store. For very sensitive information such as the Keystore encryption keys, system 60 may implement the system as a single-tenant Keystore 610 for added security. However, the Vault 630 might in an embodiment be a multi-tenant component so as to keep the information from the users of more than one customer of the system 60. For example, if the system 60 is operated by its owner/operator to service a plurality of online retailers, each of which is a customer of the operator of system 60, there are normally a large number of users of each of these tenants/customers. So a single-tenant component in this instance means a data store (e.g., 612) that is dedicated to the users of a single tenant/customer of system 60. Multi-tenant components (e.g., Vault 630) can be split into more single-tenant components or into a plurality of other smaller multi-tenant components as desired. Having a multi-tenant component (e.g., Vault 630) allows the system 60 to scale up to very large data stores containing a large amount of information. In an aspect, having more data in the data stores of system 60 reduces the significance and risk of exposure of any one or few or several pieces of PII.

By multi-instance it is meant that there is a plurality (usually many) of pieces of data stored in a component, even if the component is single-tenant. So for example the phone numbers of a large number of customers of an online retailer can be stored in a multi-instance data store component.

The architecture of exemplary FIGS. 3 and 6 could be implemented so that there is no direct connection between one data store and the next. For example, referring to FIG. 6, the system may be implemented so that the data stores or modules 610, 620, 630, 640 are substantially isolated from one another and only relay information to and from the application process 600. This can increase the security of the system 60 so as to minimize the likelihood of global loss of the PII:GUID data pairs and the likelihood of compromising the owners of that information. Furthermore, this allows the manager of the system to monitor its operation to avoid misuse of the system.
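The isolation described above can be pictured with a minimal sketch in which each component holds no reference to any other component and every request passes through the application process, which also records the request so that operation can be monitored. The class names, the `lookup` and `request` methods, the audit log, and the sample keys and values are assumptions made for illustration only; they are not taken from the figures.

```python
class Component:
    """An isolated data store; it holds no reference to any other component."""

    def __init__(self, name, data):
        self.name = name
        self._data = dict(data)

    def lookup(self, key):
        # Return the stored value, or None if the key is unknown.
        return self._data.get(key)


class ApplicationProcess:
    """The only party that talks to the components (hypothetical sketch)."""

    def __init__(self, components):
        self._components = {c.name: c for c in components}
        self.audit_log = []  # every relayed request is recorded here

    def request(self, component_name, key):
        self.audit_log.append((component_name, key))
        return self._components[component_name].lookup(key)


# Hypothetical contents: a tumbler row listing two codes, and a store
# mapping each code to a piece of PII.
app = ApplicationProcess([
    Component("tumbler", {"t-1": ["e-01", "e-02"]}),
    Component("vault", {"e-01": "David", "e-02": "02156"}),
])

ekeys = app.request("tumbler", "t-1")
values = [app.request("vault", ekey) for ekey in ekeys]
```

Because the components never call one another, compromising one of them yields only that component's contents, and the audit log gives the operator a single place to look for misuse.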

FIG. 7 illustrates a table 70 of some personal information relating to a plurality of users (for example shoppers of an online retailer). This is a simplified example; in use the table may include much more information, for example financial data and other personal data for each user. The information can be visualized as being tabular and in column and row format, but those skilled in the art will appreciate that a data structure can be of other formats, can be multi-dimensional, and can be in any representation that suits a given purpose and computer architecture. For illustration, we depict several columns of PII data such as First name 710, Last name 720, Phone number 730, Street address 740, and Postal code 750.

Note that the pieces of PII can be scrambled and require sorting so as to arrange the pieces of PII properly to correspond with a particular user. In this example, we show shaded blocks indicating a user named David Takashi whose phone number is 1.617.847.8489, whose street address includes 45 High Pass, and whose postal code is 02156. However, these pieces of PII are not generally stored in an organized form in a database, so that if the database is compromised there is no clear way to know which pieces of PII belong to which other pieces. By associating each piece of PII with a unique GUID in a PII:GUID pair and using the present system and method, including the Tumbler described herein, it is possible to properly sort and associate the various pieces of PII into a meaningful search result (e.g., for a person named David having a postal code 02156, and so on).

FIG. 8 illustrates how a data pair of a piece of PII and its unique associated GUID are kept according to an embodiment. These data pairs comprising PII:GUID are generally kept in a table 80 and kept in a data store which can be secured to further prevent unwanted disclosure. In addition, the table 80 stores the Tenant 830, which for example in this query is “aaa.” The Tumbler component described above will use the GUID 820 (also herein “ekey”) associated with a given piece of PII 810 to associate the PII of a given user or query subject together.

FIG. 9 illustrates a similar table 90 as that described above, including the Postal code 910 data and GUID that has been encrypted (e.g., ekey) 920 associated with that PII, as well as the Tenant 930 for the query in question.

FIG. 10 illustrates another table 1000 as that described above, including a Phone number 1010 and its unique GUID that has been encrypted (e.g., ekey) 1020 and Tenant ID 1030. Such tables or data structures can be made for each or subsets of each type of PII.
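As a minimal sketch of the per-field tables of FIGS. 8 to 10, the following splits user records into one table per field of PII, pairs each datum with a freshly generated GUID and the tenant identifier, and shuffles each table so that row position reveals nothing about which rows belong to the same user. The names `DataPair` and `split_into_field_tables` and the second user record are hypothetical, and the encryption of the GUID mentioned above is omitted for brevity.

```python
import random
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPair:
    pii: str     # the PII datum, e.g. "David"
    ekey: str    # the GUID paired with that datum (left unencrypted here)
    tenant: str  # tenant identifier, e.g. "aaa"


def split_into_field_tables(users, tenant, rng=None):
    """Return {field name: shuffled list of DataPair}, one table per field."""
    rng = rng or random.Random()
    tables = {}
    for user in users:
        for field, value in user.items():
            pair = DataPair(value, str(uuid.uuid4()), tenant)
            tables.setdefault(field, []).append(pair)
    for rows in tables.values():
        rng.shuffle(rows)  # discard any ordering that could link rows
    return tables


users = [
    {"First": "David", "Postal Code": "02156", "Phone": "1.617.847.8489"},
    # Hypothetical second user, added only to make the tables non-trivial.
    {"First": "Maria", "Postal Code": "10001", "Phone": "1.212.555.0100"},
]
tables = split_into_field_tables(users, tenant="aaa")
```

Nothing in the resulting tables links a row of one table to a row of another; that association would have to be recorded separately, as in the Tumbler data store.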

FIG. 11 illustrates the contents of a Tumbler data store 1100 like those described earlier. A Tumbler key or “tkey” 1110 is associated with a given GUID or “ekey” 1120 and a “Field” 1130 representing the type field of the PII. So the types of PII “First” (FIG. 8), “Postal Code” (FIG. 9) and “Phone” (FIG. 10) are referred to as type fields of PII for the present purpose.

In one aspect, the “tkey” above can be analogized to a row in a spreadsheet, with each piece of PII sharing a same “tkey” for any given data record for a user. Therefore, if two pieces of PII share a same “tkey” it would be understood that these two pieces of PII belong in the same data record for the user. Since the associated “ekeys” for the ekey:PII data pairs are encrypted (either in the PII data store or in the Tumbler data store), there is no way to determine that the two pieces of PII are part of the same record without authorization (by way of the Tumbler “tkey” acting as a row ID in the above analogy).
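The row-ID analogy can be sketched as a simple grouping step: given rows returned from the per-field tables and an unlocked set of Tumbler rows, pieces of PII whose ekeys share a tkey are collected into one record. This is an illustrative sketch only; the function name `reassemble`, the placeholder values such as "e-01" and "t-1", and the second user's data are hypothetical, and encryption and unlocking of the Tumbler are not modeled.

```python
from collections import defaultdict

# Per-field tables, each row (pii, ekey, tenant), in scrambled order.
first_names = [("Maria", "e-04", "aaa"), ("David", "e-01", "aaa")]
postal_codes = [("02156", "e-02", "aaa"), ("10001", "e-05", "aaa")]
phones = [("1.212.555.0100", "e-06", "aaa"), ("1.617.847.8489", "e-03", "aaa")]

# Tumbler rows, each (tkey, ekey, field); the tkey plays the role of a row ID.
tumbler = [
    ("t-1", "e-01", "First"), ("t-1", "e-02", "Postal Code"), ("t-1", "e-03", "Phone"),
    ("t-2", "e-04", "First"), ("t-2", "e-05", "Postal Code"), ("t-2", "e-06", "Phone"),
]


def reassemble(field_tables, tumbler_rows):
    """Group PII by tkey; rows whose ekey is not in the tumbler stay unlinked."""
    ekey_to_row = {ekey: (tkey, field) for tkey, ekey, field in tumbler_rows}
    records = defaultdict(dict)
    for rows in field_tables:
        for pii, ekey, _tenant in rows:
            if ekey in ekey_to_row:
                tkey, field = ekey_to_row[ekey]
                records[tkey][field] = pii
    return dict(records)


records = reassemble([first_names, postal_codes, phones], tumbler)
```

Calling `reassemble` with an empty list of Tumbler rows returns no records at all, which mirrors the point that the per-field tables alone do not reveal which pieces of PII belong together.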

The present invention should not be considered limited to the particular embodiments described above. Various modifications, equivalent processes, as well as numerous structures to which the present invention may be applicable, will be readily apparent to those skilled in the art to which the present invention is directed upon review of the present disclosure.

What is claimed is:
 1. A method for computer storage and access of personally identifying information (PII) including a plurality of fields of PII, comprising: collecting a plurality of PII data relating to a plurality of respective fields of PII for each user of a plurality of users; associating a unique identifying code with each said PII datum for each said field of PII so as to create unique data pairs, each data pair comprising said PII datum and its associated unique identifying code, where each unique identifying code is a globally unique identification (GUID) that is unique to the field of PII to which that PII datum belongs and to the user to which that PII datum relates; storing said data pairs in one or more computer data stores without organization from which can be determined which data pairs comprising PII data for a given user belong to which other data pairs for that same user; securing in a further separate computer data store a data structure that associates with each said user the unique identifying codes of the data pairs that comprise PII data for that user; subjecting the PII data in the one or more computer data stores to one or more queries; returning data pairs from the one or more computer data stores matching the one or more queries, where those returned data pairs (i) comprise PII data for a plurality of said users, (ii) are without organization from which can be determined which data pairs comprising PII data for a given user belong to which other data pairs for that same user; and utilizing the further separate computer data store to identify, from the unique identifying codes in the returned data pairs, which PII data for a given user in those returned data pairs belongs to which other PII data in those returned data pairs for that same user.
 2. The method of claim 1, further comprising unlocking said further separate computer data store so as to identify unique identifying codes for a given said user.
 3. The method of claim 2, wherein the further separate computer data store is encrypted with a key that differs from a key with which one or more of the data stores in which data pairs are stored is encrypted.
 4. The method of claim 1, wherein the step of storing data pairs in one or more computer data stores further comprises storing data pairs having PII data relating to at least one field of PII in a said computer data store that is separate from a said computer data store in which data pairs having PII data relating to at least one other field of PII are stored.
 5. A computer storage and access system for storing and accessing sensitive personally identifying information (PII) data for a plurality of users, comprising: a first computer data store storing a plurality of data pairs of a first field type, each of said data pairs of the first field type including a PII datum of said first field type for a respective user and a corresponding unique identifying code for that datum and user, where each unique identifying code is a globally unique identification (GUID) that is unique to the field of PII to which that datum belongs and to that user; a second computer data store storing a plurality of data pairs of a second field type, different from said first field type, each of said data pairs of the second field type including a PII datum of said second field type for a respective user and a corresponding unique identifying code for that datum and user, where each unique identifying code is a globally unique identification (GUID) that is unique to the field of PII to which that datum belongs and to that user; and a third computer data store storing a plurality of data structures associating said unique identifying codes of the first field type data pairs with said unique identifying codes of the second field type data pairs so as to permit uniquely associating the corresponding PII datum of said first type with the corresponding PII datum of said second type for a said respective user subjecting the PII data in the first and second computer data stores to one or more queries; returning data pairs from the first and second computer data stores matching the one or more queries, where those returned data pairs (i) comprise PII data for a plurality of said users, (ii) are without organization from which can be determined which data pairs comprising PII data for a given user belong to which other data pairs for that same user; and utilizing the third computer data store to identify, from the unique identifying codes in the returned data pairs, which PII data for a given user in those returned data pairs belongs to which other PII data in those returned data pairs for that same user.